A Reasoning And Hypothesis-Generation Framework Based On Scalable Graph Analytics Enabling Discoveries In Medicine Using Cray Urika-XA And Urika-GD
Authors
Abstract
Finding actionable insights from data has always been difficult. As the scale and forms of data evolve, the task of finding value becomes even more challenging. Addressing this challenge, data scientists at Oak Ridge National Laboratory are leveraging unique leadership infrastructure (e.g., Urika-XA and Urika-GD appliances) to develop scalable algorithms for semantic, logical and statistical reasoning with unstructured Big Data. In this paper, we present the deployment of such a framework, called ORIGAMI (Oak Ridge Graph Analytics for Medical Innovations), on the National Library of Medicine’s Semantic Medline (an archive of medical knowledge since 1994). Medline contains over 70 million knowledge nuggets published in 23.5 million papers in the medical literature, with thousands more added each year. ORIGAMI is available as an open-science medical hypothesis generation tool, both as a web service and as an application programming interface (API), at http://hypothesis.ornl.gov . In 2015, ORIGAMI was featured in the Historical Clinical Pathological Conference in Baltimore as a demonstration of artificial intelligence applied to medicine and was recognized as a Centennial Showcase Exhibit at the Radiological Society of North America (RSNA) Conference in Chicago. This paper describes the workflow, built using the Cray Urika-XA and Urika-GD appliances, that enables reasoning with the knowledge of every published medical paper every time a clinical researcher uses the ORIGAMI tool. Since becoming an online service, ORIGAMI has enabled clinical subject-matter experts to: (i) hypothesize the relationship between beta-blocker treatment and diabetic retinopathy; (ii) discover that xylene is an environmental carcinogen; and (iii) aid doctors with the diagnosis of challenging cases in which rare diseases manifest with common symptoms.
Keywords: natural language reasoning; scalable graph analytics; hypothesis generation; semantic reasoning
Similar resources
Experiences Running and Optimizing the Berkeley Data Analytics Stack on Cray Platforms
The Berkeley Data Analytics Stack (BDAS) is an emerging framework for big data analytics. It consists of the Spark analytics framework, the Tachyon in-memory filesystem, and the Mesos cluster manager. Spark was designed as an in-memory replacement for Hadoop that can in some cases improve performance by up to 100X. In this paper, we describe our experiences running BDAS on the new Cray Urika-XA...
YarcData's uRiKA Shows Big Data Is More Than Hadoop and Data Warehouses
The hype about big data is mostly on Hadoop or data warehouses, but big data involves a much wider and varied set of needs, practices and technologies. We offer recommendations for IT organizations seeking a solution to "graph" problems, including use of the uRiKA graph appliance. Impacts ■ IT organizations faced with previously infeasible graph-style discovery problems may succeed using a focu...
Toward a Scalable Bank of Filters for High Throughput Image Analysis on the Cray Urika-GX System
High throughput image analysis is critical for experimental science facilities and enables one to glean timely insights of the various experiments and to better understand the physical phenomena being imaged. We present the design and evaluation of banks of filters, the core building blocks for high throughput image analysis, on the Cray Urika-GX system. We describe our infrastructure developed...
Towards Seamless Integration of Data Analytics into Existing HPC Infrastructures
Customers of the High Performance Computing Center (HLRS) tend to execute more complex and data-driven applications, often resulting in large amounts of data of up to 1 Petabyte. The majority of our customers, however, is currently lacking the ability and knowledge to process this amount of data in a timely manner in order to extract meaningful information. We have therefore established a new p...
The battery for assessment of clinical reasoning in the Olympiad for medical sciences students
Clinical reasoning is not only a critical skill in medicine, but also central to the clinical practice. Considering that there is no method of assessing clinical reasoning based on the theoretical framework of medical expertise research, we could approach assessment in an innovative way taking the model of clinical reasoning as a guide. In this model three major components of clinical reasoning...